DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
Authors
Abstract
In semistructured databases there is no schema fixed in advance. To provide the benefits of a schema in such environments, we introduce DataGuides: concise and accurate structural summaries of semistructured databases. DataGuides serve as dynamic schemas, generated from the database; they are useful for browsing database structure, formulating queries, storing information such as statistics and sample values, and enabling query optimization. This paper presents the theoretical foundations of DataGuides along with an algorithm for their creation and an overview of incremental maintenance. We provide performance results based on our implementation of DataGuides in the Lore DBMS for semistructured data. We also describe the use of DataGuides in Lore, both in the user interface to enable structure browsing and query formulation, and as a means of guiding the query processor and optimizing query execution.
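To make the idea of a structural summary concrete, here is a minimal sketch, assuming the database is a rooted, edge-labeled graph given as an adjacency map. It builds a summary in which every label path of the source appears exactly once, using a subset construction in the style of NFA-to-DFA conversion. The names `build_dataguide`, `edges`, and the sample restaurant data are hypothetical illustrations, and the code is not taken from the paper or from Lore.

```python
from collections import defaultdict, deque


def build_dataguide(edges, root):
    """Summarize a rooted, edge-labeled graph (illustrative sketch only).

    edges: dict mapping an object id to a list of (label, child id) pairs.
    Returns a dict mapping each target set (a frozenset of source object
    ids reachable by one label path) to its {label: next target set}.
    """
    start = frozenset([root])
    guide = {}
    queue = deque([start])
    while queue:
        targets = queue.popleft()
        if targets in guide:
            # Already expanded; this check also stops cycles from looping.
            continue
        # Group the children of all objects in this target set by label.
        by_label = defaultdict(set)
        for obj in targets:
            for label, child in edges.get(obj, []):
                by_label[label].add(child)
        guide[targets] = {}
        for label, children in by_label.items():
            nxt = frozenset(children)
            guide[targets][label] = nxt
            queue.append(nxt)
    return guide


# Hypothetical sample database: two restaurants with different fields.
sample_db = {
    1: [("Restaurant", 2), ("Restaurant", 3)],
    2: [("Name", 4), ("Entree", 5)],
    3: [("Name", 6), ("Phone", 7)],
}
sample_guide = build_dataguide(sample_db, 1)
```

In this sketch both restaurants collapse into a single summary node whose outgoing labels are the union of the labels seen on either one, which is the kind of information a user would browse when formulating a query. As with subset construction in general, the summary can grow much larger than the source on highly irregular or cyclic data.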
Similar resources
Approximate DataGuides
DataGuides are concise and accurate summaries of semistructured databases, enabling schema exploration and improving query processing. Unfortunately, DataGuides can be very expensive to compute, especially for large, cyclic databases. For many DataGuide uses, an “approximate” summary of the database's structure can be beneficial yet much cheaper to compute. We summarize several uses of DataGui...
Summarizing and Searching Sequential Semistructured Sources
XML, the eXtensible Markup Language [XML97], is fast becoming the de facto representation for semistructured data. In the research community, initial work on semistructured databases was based on simple graph-based data models such as the Object Exchange Model (OEM) [PGMW95]. Though XML and OEM are similar, there are some differences [DFF99, GMW99], and one of the most significant of these conce...
From Semistructured Data to XML: Migrating the Lore Data Model and Query Language
Research on semistructured data over the last several years has focused on data models, query languages, and systems where the database is modeled as some form of labeled, directed graph [Abi97, Bun97]. The recent emergence of eXtensible Markup Language (XML) as a new standard for data representation and exchange on the World-Wide Web has drawn significant attention [BPSM98]. Researchers have c...
Query Optimization for Semistructured Data
With the emerging prevalence of semistructured data (data that may be irregular or incomplete), it is important to develop efficient query processing techniques for such data. This paper describes the query processor of Lore, a DBMS for semistructured data, and focuses particularly on the cost-based query optimization techniques we have developed and implemented for a semistructured environment. Whi...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to be practical, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...
Publication date: 1997